A (very) basic introduction to Python in Jupyter notebooks

The purpose of this notebook is to get you started with using Python in Jupyter notebooks. It introduces two packages along the way: pandas, a data analysis package, and matplotlib, a plotting package.

This notebook was originally created for a Digital Mixer session at the 2016 STELLA Unconference.

Importing packages

To use any functions in a package you must first import the package, or the parts of the package you want to use. Below are two different examples of importing. Here is the breakdown of what is happening:

First, we are importing the entire pandas package using:

import pandas

Next, we are changing the name of the pandas package in our current program to pd:

import pandas as pd

Doing this saves a little time later on: any time you want to use a function in the pandas package, you only have to type pd instead of the longer word pandas.

For the next import we only want part of the matplotlib library. Specifically, we want pyplot, a graphical plotting framework. To import only part of a package, we first use the keyword from to indicate which package we want to take from, and then import that subsection of code:

from matplotlib import pyplot

When the cell below is run, pandas will be imported and available as pd, and pyplot from the matplotlib package will be imported.


In [33]:
# import necessary objects
import pandas as pd
from matplotlib import pyplot
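
As a quick check of the import forms described above, the following sketch confirms that pd is just another name for the pandas module, and that from ... import binds a single name out of a package. Importing DataFrame this way is only for illustration; the notebook itself uses this form for pyplot.

```python
# the alias and the full name refer to the same module object
import pandas
import pandas as pd

same_module = pd is pandas
print(same_module)

# "from ... import ..." binds one name from the package directly
from pandas import DataFrame

same_class = DataFrame is pd.DataFrame
print(same_class)
```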

Code helpers in Jupyter

Jupyter has some built-in features to help you with programming. Two helpful features are code completion and tool-tips.

Code completion

In the Python code cell below type the following and then press tab:

pd.Da

You will see a popup box listing all the names in pandas (pd) that start with the letters 'Da'. This is the code completion tool. If you ever can't quite remember the name of a function, or want to quickly type a function out, this can be handy. Go ahead and select DataFrame from the list.
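
A rough equivalent of this lookup can also be done in code. The sketch below uses Python's built-in dir function to list the names in pandas that start with 'Da'; the variable name matches is just for illustration, and the exact list depends on your pandas version.

```python
import pandas as pd

# collect every name in the pandas namespace that starts with "Da"
matches = [name for name in dir(pd) if name.startswith("Da")]
print(matches)
```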

Tool-tips

Tool-tips provide documentation on parts of your code. For example, if you want to know what DataFrame is and how to use it, you can activate a tool-tip. To do so, first click in the text of DataFrame in the code cell below, then hold down shift and press tab. A popup should appear giving information about what type of data goes into DataFrame and a brief explanation of what it does.

For even more information, with your cursor still in the text of DataFrame, hold down shift and press tab twice. Now you are presented with a larger, scrollable popup with more detailed documentation on DataFrame.

For a full pop-out of this documentation, follow the same pattern: hold shift and press tab four times. This will open the documentation on DataFrame in another pane.
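
The text shown in a tool-tip largely comes from the object's docstring, which you can also read from code. A minimal sketch, using the standard inspect module to print just the first line of the DataFrame documentation:

```python
import inspect
import pandas as pd

# fetch the docstring for DataFrame and show only its first line
doc = inspect.getdoc(pd.DataFrame)
first_line = doc.splitlines()[0]
print(first_line)

# help(pd.DataFrame) prints the full documentation, which is long
```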


In [ ]:
# test out code completion and tool-tips here
pd.DataFrame

# after testing code completion and tool-tips, see if you can create a simple data frame
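
If you get stuck, here is one possible answer, offered as a sketch rather than the only approach. The variable name, column names, and values are made up for illustration.

```python
import pandas as pd

# each dictionary key becomes a column; each list holds that column's values
simple = pd.DataFrame({
    "name": ["Ada", "Grace", "Alan"],
    "age": [36, 45, 41],
})
print(simple)
```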

Pandas and Matplotlib

The following cells of code are simple examples of the pandas and Matplotlib packages. Try running each cell, using tool-tips, and editing the code to understand what these functions do.

This is nowhere near a full introduction to these packages. For an in-depth introduction to pandas, try pandas-cookbook. For a brief intro to plotting in Jupyter notebooks, check out Plotting with Matplotlib.


In [ ]:
# create a dictionary of fruits and their counts
fruits = {"apples": 2, "oranges": 5, "bananas": 10, "kiwi": 4, "grapes": 30}

# use the dictionary to create a pandas dataframe
fruitData = pd.DataFrame({"fruits": fruits})

# show the dataframe in the output
fruitData
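
Because fruits is nested inside another dictionary, the outer key becomes the column name and the inner keys become the row labels (the index). The sketch below repeats the cell above and inspects the result to make that structure visible; kiwi_count is a name introduced here for illustration.

```python
import pandas as pd

fruits = {"apples": 2, "oranges": 5, "bananas": 10, "kiwi": 4, "grapes": 30}
fruitData = pd.DataFrame({"fruits": fruits})

# one column named "fruits", with one row per fruit
print(fruitData.shape)
print(list(fruitData.columns))

# look up a single value by row label and column name
kiwi_count = fruitData.loc["kiwi", "fruits"]
print(kiwi_count)
```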

In [ ]:
# generate some summary statistics on the dataframe
fruitData.describe()
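
Note that describe is a method, so it needs parentheses to run; without them the notebook only displays the method itself. As a sketch of what the summary contains for the fruit counts above (summary, mean_count, and max_count are names chosen here for illustration):

```python
import pandas as pd

fruits = {"apples": 2, "oranges": 5, "bananas": 10, "kiwi": 4, "grapes": 30}
fruitData = pd.DataFrame({"fruits": fruits})

# describe() returns a new dataframe of summary statistics
summary = fruitData.describe()
print(summary)

# individual statistics can be pulled out by row label
mean_count = summary.loc["mean", "fruits"]
max_count = summary.loc["max", "fruits"]
```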

In [ ]:
# special command to make plots appear inline
%matplotlib inline
# make plots in the design style of ggplot
pyplot.style.use('ggplot')

# create a bar graph showing the number of each fruit
fruitData.sort_values("fruits").plot.barh()
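
The %matplotlib inline line only works inside a notebook. As a sketch of the same plot in a plain Python script, assuming you want to save the figure to a file instead of showing it inline (the Agg backend, the name sortedData, and the filename fruits.png are choices made here for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # draw to a file instead of a window
from matplotlib import pyplot
import pandas as pd

fruits = {"apples": 2, "oranges": 5, "bananas": 10, "kiwi": 4, "grapes": 30}
fruitData = pd.DataFrame({"fruits": fruits})

# make plots in the design style of ggplot
pyplot.style.use("ggplot")

# sort the rows by count so the bars appear in order
sortedData = fruitData.sort_values("fruits")
ax = sortedData.plot.barh()

# write the figure to disk
ax.figure.savefig("fruits.png")
```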